ABSTRACT

We perform an analysis of the spatial clustering properties of HI selected galaxies from 
the HI Parkes All Sky Survey (HIPASS) using the formalism of the halo occupation 
distribution (HOD). The resulting parameter constraints show that the fraction of 
satellite galaxies (i.e. galaxies which are not the central member of their host dark 
matter halo) among HIPASS galaxies is < 20%, and that satellite galaxies are therefore 
less common in HIPASS than in optically selected galaxy redshift surveys. Moreover 
the lack of fingers-of-god in the redshift space correlation function of HIPASS galaxies 
may indicate that the HI rich satellites which do exist are found in group mass rather 
than cluster mass dark matter halos. We find a minimum halo mass for HIPASS 
galaxies at the peak of the redshift distribution of $M \sim 10^{11}\,M_\odot$, and show that less
than 10% of baryons in HIPASS galaxies are in the form of HI. Quantitative constraints 
on HOD models from HIPASS galaxies are limited by uncertainties introduced through 
the small survey volume. However our results imply that future deeper surveys will 
allow the distribution of HI with environment to be studied in detail via clustering of 
HI galaxies. 



Key words: cosmology: large scale structure, observations - galaxies: halos, statistics - radio lines: galaxies



1 INTRODUCTION



The cosmic star-formation rate has declined by more than 
an order of magnitude in the past 8 billion years (Lilly et 
al. 1996, Madau et al. 1996), a trend that is observed across 
all wavelengths (Hopkins 2004 and references therein). Why 
this decline has taken place, and what drives it are two of 
the most important unanswered questions in our current 
understanding of galaxy formation and evolution. One of 
the issues that will need to be addressed in order to answer
these questions is the role of environment. In cold dark matter
cosmologies, gas cools and collapses to form stars within 
gravitationally bound halos of dark matter. These galaxies 
can then grow via continued star formation or via mergers 
with other galaxies. As galaxies of a given baryonic mass 
can only reside within dark matter halos above a particular 
dark matter mass, galaxies are biased tracers of the overall 
dark matter distribution. 

In linear theory, the bias in the spatial clustering of 
dark matter halos relative to the underlying mass distri- 
bution is a function of halo mass but not of spatial scale. 
As a consequence, if the mass power-spectrum is known,
the clustering of galaxies on large scales yields strong constraints
on the masses of the dark matter halos in which
they reside. On smaller scales, the simple relationship between
galaxy clustering and halo mass breaks down. Firstly,
the mass power-spectrum is in the non-linear regime. More 
importantly, multiple galaxies can be distributed within in- 
dividual halos (at separations <1 Mpc), with the number of 
galaxies within halos of a given mass exhibiting some scat- 
ter. While this complicates the modelling of galaxy cluster- 
ing, it enables measurements of spatial clustering of galaxies 
to determine how galaxies populate dark matter halos as a 
function of halo mass. Some understanding of these issues is 
provided by simulations and these can be (and have been) 
tested against observations of the spatial clustering of opti- 
cally selected galaxies. 

By understanding how galaxies populate dark matter 
halos, key insights may be obtained into how galaxies grow 
over cosmic time. For example, while the merger rate of dark 
matter halos is known, modelling the dynamical friction of 
sub-halos (and thus galaxies) in cosmological simulations is 
non-trivial and the rate of galaxy growth via merging has 
been uncertain as a consequence. Knowing how galaxies populate
dark matter halos resolves this problem. In particular,
consider a case where the timescale for dynamical friction 
following the merger of two dark matter halos is short com- 
pared with the Hubble time. In this case galaxies within 
these halos will also merge soon after, and satellite galaxies 
will be relatively rare. On the other hand, if the dynamical 
friction timescale is long, then the galaxies within these ha- 
los may remain as satellite galaxies for many Gyr. In this 
case satellite galaxies will be relatively common. Brown et 
al. (2008) show that the latter scenario holds for the most
massive dark matter halos, with much of the stellar mass in 
massive halos residing within satellite galaxies. 

The way in which stellar mass populates dark matter 
halos has, to first order, been determined for optically se- 
lected galaxy samples. However little is known about how 
HI, the fuel for star-formation, populates group and cluster 
mass dark matter halos. HI galaxies in the Fornax region 
have been studied by Waugh et al. (2002), who found very 
few galaxies to be associated with the Fornax cluster. None 
of the HI detections in Waugh et al. (2002) are early-type 
galaxies. Moreover, only 2 of the HI detections have both 
Fornax redshifts and are within 1 degree (~ 300 kpc) of the 
cluster centre. These results may suggest that there is a cen- 
tral galaxy high mass cut-off near the Fornax cluster halo 
mass (which is $7 \times 10^{13}\,M_\odot$ according to Drinkwater et al.
2001). More recently Cortese et al. (2008) have used Arecibo 
to survey a 5 square degree region around Abell 1367. They 
find a uniform distribution of HI-selected galaxies throughout
the volume (i.e. when observed in HI the cluster Abell
1367 disappears), and that HI deficiency does not vary significantly
with cluster-centric distance. These authors also
find no finger-of-god effect in the HI-selected galaxies (in a
redshift-position diagram, rather than in a clustering analysis).
Similarly, Verheijen et al. (2007) study Abell 963 and
2192 at z = 0.2 and find only one HI-selected galaxy within
1 Mpc of the centre of each cluster. On the other hand, de
Blok et al. (2002) find that there are HI galaxies in Sculptor 
and Centaurus with HI masses of ~ IO^Mq. However these 
clusters have dynamical masses ~ 1.5 orders of magnitude 
lower than that of the Fornax and Abell clusters discussed
above, and the identification of these with the clusters is not 
definitive. 

Thus there are many questions. For example, is HI 
stripped from galaxies entering cluster, group or lower mass 
halos? Is there a dark matter halo mass above which HI 
is heated or removed from galaxies? Is the HI content of 
galaxies largely a function of galaxy stellar mass or host 
dark matter halo mass? Do the stellar masses of HI selected 
galaxies grow largely via star-formation or galaxy mergers? 
These questions can be addressed using the observed clus- 
tering of HI selected galaxies to constrain models of how 
HI populates dark matter halos. A popular formalism for 
modeling clustering on small to large scales is termed the 
halo occupation distribution (HOD) model (e.g. Peacock & 
Smith 2000; Seljak 2000; Scoccimarro et al. 2001; Berlind & 
Weinberg 2002; Zheng 2004; Zehavi et al. 2004). The HOD 
model includes contributions to galaxy clustering from pairs 
of galaxies in distinct halos which describes the clustering in 
the large scale limit, and from pairs of galaxies within a sin- 
gle halo which describes clustering in the small scale limit. 
The latter contribution requires a parametrisation of
the number and spatial distribution of galaxies within a dark
matter halo of a particular mass. Measurements of the HOD
of optically selected galaxies provide some insights into how 
galaxies evolve. For example, in the most massive dark mat- 
ter halos, central galaxy stellar mass is proportional to halo 
mass to the power of approximately 1/3. Much of the stellar
mass within these halos resides within satellite galaxies
(e.g. Brown et al. 2008, Moster et al. 2009). This result implies
that the mergers of dark matter halos do not always
lead to mergers of galaxies, and as a consequence massive 
galaxy growth is slow relative to the rapid growth of dark 
matter halos. Whether this result is also true for lower mass 
star-forming galaxies is unknown at this time. 

In recent years large galaxy redshift surveys such as 
SDSS and the 2dFGRS have enabled detailed studies of the 
clustering of more than 100,000 optically selected galaxies
in the nearby universe. By comparison, the largest survey 
of HI selected galaxies contains ~ 5000 sources, obtained 
as part of the HI Parkes All Sky Survey (HIPASS, Barnes 
et al. 2001), a blind HI survey of the southern sky. Meyer 
et al. (2007) have studied the clustering of these HI galax- 
ies. Their analysis reached the conclusion of weak clustering 
of HI galaxies based on parametric estimates of correlation 
length (see also Basilakos et al. 2007), but did not study the 
clustering in terms of the host dark matter halo masses of 
the HIPASS sample. In this paper we revisit the clustering 
of HIPASS galaxies using the HOD model. There are sys- 
tematic uncertainties in estimation of the observed cluster- 
ing amplitude, arising from the selection function and small 
survey volume (Meyer et al. 2007), and our results show 
that this limits the precision with which conclusions from 
clustering can be made. Nevertheless we illustrate that the 
clustering of HIPASS galaxies already provides interesting 
constraints on the distribution of HI within the dark matter 
halo population. 

This paper is organised as follows. We begin by summarising
the clustering of HI galaxies in the HIPASS survey
in § 2. We then summarise the formalism for the real and
redshift-space HOD models for the correlation function (§ 3),
which are discussed relative to the HIPASS observations in § 4
and § 5 respectively. We discuss the satellite fraction in § 6
and summarise our findings in § 7. In our numerical examples,
we adopt the standard set of cosmological parameters
(Komatsu et al. 2008), with values of $\Omega_m = 0.24$, $\Omega_b = 0.04$
and $\Omega_\Lambda = 0.76$ for the matter, baryon, and dark energy fractional
density respectively, $h = 0.73$ for the dimensionless
Hubble constant, and $\sigma_8 = 0.81$ for the variance of the linear
density field within regions of radius $8\,h^{-1}$ Mpc.



2 CLUSTERING OF HIPASS GALAXIES 

Meyer et al. (2007) computed the redshift space correla- 
tion function of HI selected galaxies from 4315 detections in 
the HIPASS catalogue (HICAT; Barnes et al. 2001; Meyer 
et al. 2004; Zwaan et al. 2004). Correlation functions were 
produced by weighting each galaxy pair equally (termed 
unweighted), and by weighting each pair in a way that 
corrects for the survey selection function and minimises 
the variance in the correlation function estimate (termed 
weighted). From the redshift space correlation function, 
Meyer et al. (2007) computed the real space correlation functions
in both the weighted and unweighted cases, using inversions
that were both non-parametric and which assumed
a power law.

Figure 1. Plot of HI mass versus dynamical mass for HI galaxies in HIPASS. The left and right hand panels plot the HI mass against a dynamical
mass estimated from the circular velocity using the methods of M. Verheijen and of Kochanek & White (2001) respectively. The straight
line shows the upper limit on HI mass of $M_{\rm HI} < (\Omega_b/\Omega_m)\,M_{\rm dyn}$.

In this paper we restrict our attention to non- 
parametric estimates of the real space correlation function. 
However given the sensitivity of the measured clustering to 
the weighting scheme adopted, we fit both the unweighted 
and the weighted real-space correlation function from Meyer 
et al. (2007). From their estimated correlation function, 
Meyer et al. (2007) calculate a correlation length for the 
HIPASS galaxies. In this paper our aim is instead to in- 
terpret the astrophysical context of the measured cluster- 
ing, namely the distribution of HI galaxies within the dark- 
matter halo population and the typical dark matter halo 
mass. 



2.1 Density of HIPASS galaxies 

Constraints on HOD models are provided both by the clus- 
tering of galaxies, and by the density of galaxies via compar- 
ison with the dark-matter halo mass function. We estimate 
the space density of HIPASS sources from the HI mass func- 
tion (Zwaan et al. 2005a), yielding 

$n_{\rm gal} = \theta_*\,\Gamma(1 + \alpha,\, M_{\rm HI,lim}/M_{\rm HI,*})$, (1)

where $M_{\rm HI,lim}$ is the lowest HI mass included in the calculation
of space densities, and the parameters have measured
values of $\alpha = -1.37$, $\theta_* = 0.0060\,{\rm Mpc}^{-3}$, and
$M_{\rm HI,*} = 10^{9.8}\,M_\odot$. If all HIPASS galaxies with HI masses
$> 10^{7}\,M_\odot$ were included, the space density would be $n_{\rm gal} \sim
0.15\,{\rm Mpc}^{-3}$. However HI masses of $10^{7}\,M_\odot$ can only be detected
out to very small distances in HIPASS, and so are
not really represented in the calculation of the correlation
function. A better estimate is obtained by looking at the
peak of the redshift distribution, where the typical HI mass
is $\sim 10^{9.25}\,M_\odot$. The space density for HI masses larger than
$10^{9.25}\,M_\odot$ is $n_{\rm gal} \sim 0.0069\,{\rm Mpc}^{-3}$. We estimate the error on
this value to be ~ 15%.
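
As an illustrative check of equation (1), the following minimal sketch evaluates the integrated Schechter form $\theta_*\,\Gamma(1+\alpha, M_{\rm HI,lim}/M_{\rm HI,*})$ with the Python standard library only. The helper names, the series-plus-recurrence evaluation of the incomplete gamma function, and the adopted value $\log_{10}(M_{\rm HI,*}/M_\odot) = 9.8$ are our assumptions for the purpose of the example; they are not taken from the HIPASS analysis code.

```python
import math

def lower_gamma_series(s, x, terms=200):
    """Lower incomplete gamma function gamma(s, x) for s > 0, from its power series.

    gamma(s, x) = x^s * sum_k (-x)^k / (k! * (s + k)); adequate for modest x.
    """
    total = 0.0
    term = 1.0  # holds (-x)^k / k!
    for k in range(terms):
        total += term / (s + k)
        term *= -x / (k + 1)
    return x**s * total

def upper_gamma(a, x):
    """Upper incomplete gamma function Gamma(a, x) for -1 < a < 0 and x > 0."""
    s = a + 1.0
    upper_s = math.gamma(s) - lower_gamma_series(s, x)
    # Recurrence: Gamma(a + 1, x) = a * Gamma(a, x) + x^a * exp(-x)
    return (upper_s - x**a * math.exp(-x)) / a

def space_density(log_m_lim, alpha=-1.37, theta_star=0.0060, log_m_star=9.8):
    """Space density (Mpc^-3) of sources with HI mass above 10**log_m_lim M_sun.

    The default parameter values are assumptions for this illustration.
    """
    x = 10.0 ** (log_m_lim - log_m_star)
    return theta_star * upper_gamma(1.0 + alpha, x)

if __name__ == "__main__":
    for log_m in (7.0, 9.25):
        print(log_m, space_density(log_m))
```

With these assumed inputs, a limiting mass of $10^{7}\,M_\odot$ gives a density close to 0.15 Mpc$^{-3}$, and a limit near $10^{9.25}\,M_\odot$ gives a density close to 0.007 Mpc$^{-3}$, which shows how strongly the adopted density depends on the limiting HI mass.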



2.2 Dynamical masses of HIPASS galaxies 

An analysis of the observed clustering of a galaxy population 
based on the bias of dark-matter halos implicitly assumes a 
relationship between galaxy luminosity (or in this case HI 
mass) and the host halo mass. Before proceeding to discuss 
the formalism for the model of halo clustering we there- 
fore describe the relation between HI mass and dynamical 
mass for galaxies in the HIPASS survey. The dynamical mass 
$M_{\rm dyn}$ of the HI galaxies was estimated using the circular velocity
($V_c$) derived from the width of the HI spectrum using
two methods. Firstly, based on the work of Marc Verheijen
(PhD thesis), we have estimated the mass of a dark matter
halo with a Hernquist (1990) profile using the relation
$M_{\rm dyn} = 10^{10}\,R\,(V_c/103.9\,{\rm km\,s^{-1}})^2\,M_\odot$, where for the radius
$R$ we have adopted the B-band Kron radius (measured in
kpc). The resulting relation is shown in the left hand panel
of Figure 1. Secondly, we have also estimated $M_{\rm dyn}$ from $V_c$
based on Kochanek & White (2001), with results plotted in
the right hand panel of Figure 1. Each panel includes a linear
relationship to guide the eye, showing the upper limit on
HI mass $M_{\rm HI} < (\Omega_b/\Omega_m)\,M_{\rm dyn}$. These panels illustrate that
while there is significant scatter, larger HI masses are found
in more massive host halos. Figure 1 illustrates that the relationship
between HI and dynamical mass is shallower
than linear, with $M_{\rm HI} \propto M_{\rm dyn}^{\gamma}$ where $\gamma \sim 0.5$-$0.7$. These
dynamical masses are defined such that they are comparable 
to a definition based on the volume which encloses mass at 
~ 200 times the mean density of the Universe. 
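
The first dynamical mass estimate and the baryon-fraction upper limit can be written down in a few lines. The sketch below assumes the relation has the form $M_{\rm dyn} = 10^{10}\,R\,(V_c/103.9\,{\rm km\,s^{-1}})^2\,M_\odot$ with $R$ in kpc; the function names and the value of Newton's constant in these units are ours. The coefficient is numerically close to $4 R V_c^2/G$, which is what a Hernquist profile gives for the total mass if $R$ is identified with the scale radius. That identification is our reading and is not stated in the text.

```python
# Newton's constant in kpc (km/s)^2 / M_sun (assumed value for this sketch)
G_KPC = 4.30091e-6

def dynamical_mass(radius_kpc, v_circ_kms):
    """Dynamical mass in M_sun from M_dyn = 1e10 * R * (V_c / 103.9 km/s)^2."""
    return 1.0e10 * radius_kpc * (v_circ_kms / 103.9) ** 2

def hi_mass_upper_limit(m_dyn, omega_b=0.04, omega_m=0.24):
    """Upper limit on HI mass if all baryons in the halo were HI."""
    return (omega_b / omega_m) * m_dyn

if __name__ == "__main__":
    m_dyn = dynamical_mass(radius_kpc=10.0, v_circ_kms=150.0)  # illustrative inputs
    print(m_dyn, hi_mass_upper_limit(m_dyn))
    # Ratio of 4 R V^2 / G to the fitted normalisation, for R = 1 kpc
    print(4.0 * 103.9**2 / G_KPC / 1.0e10)
```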

The largest dynamical mass among the HIPASS sam- 
ple is ~ IO^^Mq. For comparison, we expect a number 
N ~ Fhipass X Mdn/dM = 300 of halos in the HIPASS 
volume Vhipass, where we calculate dn/dM using the Sheth- 
Tormen (2002) mass function. This yields N - 300, 30 and 
1 for masses of M = IO^Mq, lO^^M© and IO^'^Mq respec- 
tively. Figure [T] shows that the observed number of these 
massive halos is much lower than the mass function pre- 
dicts, although they should be detectable throughout the 
HIPASS search volume. Thus it appears that the most massive
halos in the HIPASS volume do not host an HI galaxy
which traces the halo potential.



3 HOD MODELS 

In this paper we model the clustering of HIPASS galaxies 
using the halo occupation distribution formalism (HOD; e.g. 
Peacock & Smith 2000; Seljak 2000; Scoccimarro et al. 2001; 
Berlind & Weinberg 2002; Zheng 2004). Our approach is 
to fit HOD parameters for the non-parametric estimate of 
the real space HIPASS correlation function. Based on these 
fits, we then calculate the redshift-space correlation function 
using the analytic formalism described in Tinker (2007). In 
this section we describe the HOD modeling formalism briefly 
to provide context for the particular parametrisation used, 
and refer the reader to the above papers for details. 



3.1 The real-space HOD model 

The HOD model is constructed around the following simple 
assumptions. First, one assumes that there is either zero or 
one central galaxy that resides at the centre of each halo. 
Satellite galaxies are then assumed to follow the dark matter 
distribution within the halos. The mean number of satellites 
is typically assumed to follow a power-law function of halo 
mass, while the number of satellites within individual halos 
follows a Poisson (or some other) probability distribution. 

The two-point correlation function can be decomposed 
into one-halo and two-halo terms 



$\xi(r) = [1 + \xi_{1h}(r)] + \xi_{2h}(r)$, (2)



corresponding to the contributions to the correlation func- 
tion from galaxy pairs which reside in the same halo and in 
two different halos respectively (Zheng 2004). In real space 
the 1-halo term can be computed using (Berlind & Wein- 
berg 2002) 



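$1 + \xi_{1h}(r) = \frac{1}{2\pi r^2 \bar{n}_g^2} \int_0^\infty dM\, \frac{dn}{dM}\, \frac{\langle N(N-1)\rangle_M}{2}\, \frac{1}{2R_{\rm vir}(M)}\, F'\!\left(\frac{r}{2R_{\rm vir}}\right)$. (3)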

Here $\bar{n}_g$ is the mean number density of galaxies. We assume
the Sheth-Tormen (1999) mass function $dn/dM$ using
parameters from Jenkins et al. (2001) throughout this paper.
The distribution of multiple galaxies within a single
halo is described by the function $F(x)$, which is the cumulative
fraction of galaxy pairs closer than $x = r/R_{\rm vir}$. The
contribution to $F$ is divided into pairs of galaxies that do,
and do not, involve a central galaxy, and is computed assuming
that galaxies follow the number-density distribution
of a Navarro, Frenk & White (1997; NFW) profile (see e.g.
Zheng 2004). The quantity $\langle N(N-1)\rangle_M$ is the average number
of galaxy pairs within a halo of mass $M$. We assume an average distribution, with
$\langle N(N-1)\rangle_M = \langle N\rangle_M^2 - \langle N\rangle_M$.
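
The 1-halo term depends on the virial radius $R_{\rm vir}(M)$ and on the pair count $\langle N(N-1)\rangle_M$. As a rough sketch, the code below assumes a virial radius defined by a mean enclosed density of 200 times the mean matter density (the definition mentioned in § 2.2 for the dynamical masses), together with the cosmological parameters adopted in this paper. The critical density constant and the function names are our assumptions.

```python
import math

# Critical density in h^2 M_sun / Mpc^3 (standard value, assumed here)
RHO_CRIT_H2 = 2.775e11

def virial_radius_mpc(mass_msun, omega_m=0.24, h=0.73, overdensity=200.0):
    """Radius (Mpc) enclosing `overdensity` times the mean matter density."""
    rho_mean = omega_m * RHO_CRIT_H2 * h**2  # M_sun / Mpc^3
    volume = mass_msun / (overdensity * rho_mean)
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)

def mean_pairs(n_mean):
    """Pair count <N(N-1)> = <N>^2 - <N>, the form assumed in the text."""
    return n_mean**2 - n_mean

if __name__ == "__main__":
    for mass in (1.0e11, 1.0e13, 1.0e15):
        print(mass, virial_radius_mpc(mass))
```

Under these assumptions a $10^{11}\,M_\odot$ halo has a virial radius of roughly 0.15 Mpc, so pairs inside a single halo of this mass contribute only at separations well below 1 Mpc.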

The 2-halo term can be computed as the halo correla- 
tion function weighted by the distribution and occupation 
number of galaxies within each halo. The 2-halo term of the 
galaxy power-spectrum is 



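$P_{gg}^{2h}(k) = P_m(k) \left[ \frac{1}{\bar{n}_g} \int_0^{M_{\rm max}} dM\, \frac{dn}{dM}\, \langle N\rangle_M\, b(M)\, y_g(k,M) \right]^2$, (4)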



where $P_m$ is the mass power-spectrum and $y_g$ is the normalised
Fourier transform of the galaxy distribution profile
(i.e. NFW). To compute the halo bias $b(M)$ we use the
Sheth, Mo and Tormen (2001) fitting formula. The quantity
$M_{\rm max}$ is taken to be the mass of a halo with separation $2r$.
The 2-halo term for the correlation function follows from 



$\xi_{2h}(r) = \frac{1}{2\pi^2} \int_0^\infty P_{gg}^{2h}(k)\, k^2\, \frac{\sin kr}{kr}\, dk$. (5)
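
The step from a power spectrum to a correlation function is a spherical Fourier transform, $\xi(r) = (2\pi^2)^{-1}\int P(k)\,k^2\,[\sin kr/kr]\,dk$. The sketch below illustrates that transform with a simple trapezoidal rule and checks it against a toy power spectrum with a known analytic transform. The toy spectrum, the integration limits and the function names are assumptions for the illustration; they are not the 2-halo power spectrum of the HOD model.

```python
import math

def xi_from_power(power, r, k_max=500.0, dk=0.01):
    """xi(r) = 1/(2 pi^2) * Int_0^k_max P(k) k^2 sin(kr)/(kr) dk (trapezoidal rule)."""
    n = int(k_max / dk)
    total = 0.0
    for i in range(n + 1):
        k = i * dk
        window = 1.0 if k == 0.0 else math.sin(k * r) / (k * r)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * power(k) * k * k * window
    return total * dk / (2.0 * math.pi**2)

def toy_power(k, a=1.0):
    """Toy spectrum whose exact transform is xi(r) = exp(-r / a)."""
    return 8.0 * math.pi * a**3 / (1.0 + (k * a) ** 2) ** 2

if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        print(r, xi_from_power(toy_power, r), math.exp(-r))
```

A realistic calculation would replace `toy_power` with the 2-halo galaxy power spectrum and would need more care with the oscillatory integrand at large $kr$.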



On large scales the correlation function is sensitive only 
to the 2-halo term, and only to the number weighted galaxy 
bias. However on small scales, both the 1-halo and 2-halo 
terms contribute to the clustering, and the detailed shape 
of the correlation function is sensitive to the distribution of 
galaxies within halos. We use the following parametrisation 
to describe this distribution. Halos are assumed to host a 
single central galaxy and a number $N_{\rm sat}$ of satellite galaxies
if their mass is in excess of $M_{\rm min}$. The number of satellites
is taken to be a power law in mass with characteristic
mass scale $M_1$ and index $\alpha$. However, motivated by the fact
that HI galaxies seem to be underrepresented as satellites in
galaxy clusters (Waugh et al. 2002), we also include an upper
limit for the halo mass which can contain an HI satellite
($M_{1,\rm max}$). Thus the mean occupation of a halo of mass $M$ is
assumed to be

$\langle N\rangle_M = 1 + \langle N\rangle_{\rm sat}$ if $M > M_{\rm min}$
$\langle N\rangle_M = 0$ otherwise,

where the number of satellites is defined to be

$\langle N\rangle_{\rm sat} = (M/M_1)^{\alpha}$ if $M_{\rm min} < M < M_{1,\rm max}$
$\langle N\rangle_{\rm sat} = 0$ otherwise.
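
The occupation model described above (one central galaxy above $M_{\rm min}$, plus a power-law number of satellites between $M_{\rm min}$ and $M_{1,\rm max}$) can be sketched as follows. The function names and the parameter values in the example are placeholders chosen for illustration and are not fitted values.

```python
def mean_satellites(mass, m_min, m_1, m_1_max, alpha):
    """Mean number of satellites: (M / M_1)^alpha for M_min < M < M_1,max, else 0."""
    if m_min < mass < m_1_max:
        return (mass / m_1) ** alpha
    return 0.0

def mean_occupation(mass, m_min, m_1, m_1_max, alpha):
    """Mean number of galaxies: one central plus satellites above M_min, else 0."""
    if mass > m_min:
        return 1.0 + mean_satellites(mass, m_min, m_1, m_1_max, alpha)
    return 0.0

if __name__ == "__main__":
    # Placeholder parameters (M_sun), for illustration only
    params = dict(m_min=1.0e11, m_1=1.0e13, m_1_max=1.0e15, alpha=1.0)
    for mass in (1.0e10, 1.0e12, 1.0e13, 1.0e14, 1.0e16):
        print(mass, mean_occupation(mass, **params))
```

In this form $M_1$ is the mass at which a halo hosts on average one satellite, and halos above $M_{1,\rm max}$ host only their central galaxy.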



3.2 The redshift space HOD model 

Tinker (2007) has extended the above model to calculate 
the redshift space correlation function for a given HOD 
parametrisation. The redshift space model is again computed
based on the sum of 1-halo and 2-halo terms as in
equation (2). In redshift space, the apparent recessional velocity
of a galaxy is the sum of its motion in the Hubble
flow (directly related to its physical distance), and of its peculiar
velocity (which modifies the apparent distance inferred from
Hubble's constant). The 1-halo term is computed in analogy
with equation (3), but with an additional integral over the
line-of-sight distance and a probability distribution for the
line-of-sight peculiar velocity. The result of these peculiar
motions is the so-called fingers-of-god, large line-of-sight
features in redshift space. Tinker (2007) suggests that the 2-halo
term of the redshift space correlation function at apparent
line-of-sight ($r_\pi$) and transverse ($r_\sigma$) distances is most easily
computed by integrating over the 2-halo term of the real
space correlation function.



$1 + \xi_{2h}(r_\sigma, r_\pi) = \int_{-\infty}^{\infty} [1 + \xi_{2h}(r)]\, P_{2h}(v_z | r, \phi)\, dv_z$, (6)



where $P_{2h}(v_z | r, \phi)$ is the probability density for the line-of-sight
velocity between pairs in two distinct halos, and
$\cos\phi = r_\sigma/r$. Here $r^2 = r_\sigma^2 + z^2$, and $v_z = H(r_\pi - z)$. Calculation
of $P_{2h}$, including determination of fitting formulae
to N-body simulations, is complex and we refer the reader to
Tinker (2007) for details.
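
To make the structure of the streaming integral concrete, the toy sketch below convolves a real-space correlation function with a velocity distribution along the line of sight. It assumes a Gaussian $P(v_z)$ with zero mean and a constant dispersion, and the conventions $r^2 = r_\sigma^2 + z^2$ and $v_z = H(r_\pi - z)$; these are simplifying assumptions of ours. The actual $P_{2h}$ of Tinker (2007) depends on separation, angle and halo mass and is calibrated on N-body simulations, so this is not a substitute for that model.

```python
import math

def xi_redshift_space(xi_real, r_sigma, r_pi, sigma_v,
                      hubble=73.0, n_steps=2001, n_sigma=6.0):
    """Toy streaming model with a Gaussian line-of-sight velocity distribution.

    Distances in Mpc, velocities in km/s, hubble in km/s/Mpc.
    Returns xi(r_sigma, r_pi) from 1 + xi = Int [1 + xi_real(r)] P(v_z) dv_z.
    """
    v_max = n_sigma * sigma_v
    dv = 2.0 * v_max / (n_steps - 1)
    norm = 1.0 / (sigma_v * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(n_steps):
        v_z = -v_max + i * dv
        z = r_pi - v_z / hubble            # from v_z = H (r_pi - z)
        r = math.sqrt(r_sigma**2 + z**2)   # real-space pair separation
        pdf = norm * math.exp(-0.5 * (v_z / sigma_v) ** 2)
        weight = 0.5 if i in (0, n_steps - 1) else 1.0
        total += weight * (1.0 + xi_real(r)) * pdf * dv
    return total - 1.0

if __name__ == "__main__":
    power_law = lambda r: (r / 5.0) ** -1.8  # illustrative real-space xi
    for sigma_v in (1.0, 150.0, 400.0):
        print(sigma_v, xi_redshift_space(power_law, 3.0, 4.0, sigma_v))
```

Increasing the assumed velocity dispersion smears the clustering signal along the line of sight, which is the qualitative origin of the fingers-of-god discussed above.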



4 REAL SPACE HOD MODELS OF HIPASS
GALAXIES

In this section we describe fitting of HOD models to the 
real-space correlation function of HIPASS galaxies. Given 
the systematic uncertainty in the estimate of the correla- 
tion function owing to the small survey volume we follow 
Meyer et al. (2007) and choose to fit both the unweighted 
and weighted HIPASS correlation functions (although we 
note that the latter should more fairly estimate the correla- 
tion function that would be obtained from a larger, volume 
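limited survey). Our HOD model has four free parameters,
$M_{\rm min}$, $M_1$, $M_{1,\rm max}$ and $\alpha$, for combinations of which we compute
the real space correlation function, and calculate the
likelihood of the model as

$\mathcal{L}(M_{\rm min}, M_1, M_{1,\rm max}, \alpha) = \exp(-\chi^2/2)$, (7)

where

$\chi^2 = \sum_{i=1}^{N_{\rm obs}} \left( \frac{\log\xi(r_i | M_{\rm min}, M_1, M_{1,\rm max}, \alpha) - \log\xi_{\rm obs}(r_i)}{\sigma_{\rm obs}(r_i)} \right)^2 + \left( \frac{\log \bar{n}_g(M_{\rm min}, M_1, M_{1,\rm max}, \alpha) - \log n_{\rm gal}}{\sigma_{\rm gal}} \right)^2$. (8)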



Here $\xi_{\rm obs}$ is the observed correlation function measured at
a number ($N_{\rm obs}$) of radii $r_i$, with uncertainty $\sigma_{\rm obs}(r_i)$ (in
dex), and $n_{\rm gal}$ is the observed galaxy density with uncertainty
$\sigma_{\rm gal}$ (in dex). We compute the density $\bar{n}_g$ using
the Sheth-Tormen (2002) mass function as part of the HOD
model. The error bars on the observational estimates are
not symmetric. Note that we assume the correlation function
points at different radii to be independent (as should be
the case for a small sample, with large Poisson dominated
noise). Covariance between measurements of the correlation
function at different radii can lead to unrealistically small 
errors on constrained HOD model parameters. We do not 
add this layer of sophistication to our analysis, owing to 
the large uncertainties already introduced into the cluster- 
ing measurements via the chosen weighting scheme. 
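
The goodness-of-fit statistic described above combines logarithmic residuals of the correlation function at each radius with a single logarithmic residual for the galaxy density, and treats all points as independent. A minimal sketch, assuming symmetric uncertainties in dex (the text notes that the real error bars are not symmetric) and using function names of our own choosing, is:

```python
import math

def chi_square(xi_model, xi_obs, sigma_obs_dex, n_model, n_gal, sigma_gal_dex):
    """Chi-square from log10 residuals of xi(r_i) plus one galaxy-density term."""
    chi2 = 0.0
    for model, obs, sigma in zip(xi_model, xi_obs, sigma_obs_dex):
        chi2 += ((math.log10(model) - math.log10(obs)) / sigma) ** 2
    chi2 += ((math.log10(n_model) - math.log10(n_gal)) / sigma_gal_dex) ** 2
    return chi2

def likelihood(chi2):
    """Relative likelihood exp(-chi^2 / 2)."""
    return math.exp(-chi2 / 2.0)

if __name__ == "__main__":
    # Made-up numbers, purely to show the call pattern
    chi2 = chi_square(
        xi_model=[12.0, 1.1, 0.09],
        xi_obs=[10.0, 1.0, 0.10],
        sigma_obs_dex=[0.2, 0.1, 0.15],
        n_model=0.0075,
        n_gal=0.0069,
        sigma_gal_dex=0.06,
    )
    print(chi2, likelihood(chi2))
```

Evaluating this function on a grid of the four HOD parameters and normalising by the peak value would give likelihood surfaces of the kind described in this section.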

We begin by fitting our HOD model to the unweighted 
real-space clustering of HIPASS galaxies. The upper row of 
the upper set of panels in Figure 2 shows contours of the likelihood
in 2-d projections of this 4-d parameter space. Here
prior probabilities on $\alpha$, $\log M_{\rm min}$, $\log M_1$ and $\log M_{1,\rm max}$ are
assumed to be constant. The contours are placed at 60%, 
30% and 10% of the peak likelihood (the location of which 
is marked by a dot). The lower row shows the corresponding 
marginalised likelihoods on individual parameters. Meyer et 
al. (2007) noted that the correlation length of HIPASS galax- 
ies is smaller than for optical surveys. Here we quantify the 
clustering on large scales via the host halo mass, finding a 
value of Mmin ~ lO^^'^^^'^Mo. On smaller scales, the halo 
occupation modeling illustrates the requirement of a non- 
zero 1-halo term in order to reproduce the excess clustering 
of galaxies at r < IMpc. We find Mi ~ 1O" ''±°-^M0, which 
is two orders of magnitude larger than $M_{\rm min}$. The power-law
index is tightly correlated with $M_1$, but loosely constrained
to be $\alpha > 1$. Since $M_1$ represents the characteristic mass
where satellites outnumber the central galaxies, the large
value of $M_1$ indicates that there are only a small number of
satellite galaxies in the HIPASS sample.

In the lower set of panels in Figure 2 we repeat this
analysis for the weighted estimate of the HIPASS real-space
correlation function. Here we find best fit estimates
of Mmin ~ lO"-^±''-^M0, and Mi ~ lO^^'^^^ '^Mo. There 
is greater tension between the galaxy density and clustering
amplitude in this case, leading to larger values of $\chi^2$ for
the best fit. We find $M_1 \sim 10\,M_{\rm min}$, smaller than the difference
found in the unweighted case. However the value of the
power-law slope is loosely constrained to be $\alpha \sim 0.7 \pm 0.4$,
weakly preferring satellites to be in smaller halos (but consistent
with a linear relation). In this case $M_1$ and $\alpha$ are
again tightly correlated, with a smaller value of $M_1$ associated
with a shallower index $\alpha$ in order to produce the low
amplitude of the small scale clustering.

If $\alpha$ is forced to equal unity in our analysis, then we
find $M_1 = 10^{13}\,M_\odot \sim 50$-$100\,M_{\rm min}$ for both the unweighted
and weighted estimates. For comparison, with $\alpha = 1$, the
red galaxy sample (chosen to exclude gas rich galaxies with
a large star formation rate) from Brown et al. (2008) has
clustering described by $M_1 \sim 3\,M_{\rm min}$, while clustering of
galaxies in the Sloan Digital Sky Survey (including both
gas-rich and gas-poor galaxies) suggests $M_1 \sim 20\,M_{\rm min}$ (Zehavi
et al. 2005). Thus the qualitative conclusions of both
the weighted and unweighted estimates of the HIPASS correlation
function are consistent; namely that HI satellites
in groups and clusters are rare compared to the results of
optical clustering studies. We return to quantify this point
further in § 6. The effect of satellites on the real space correlation
function at small scales is illustrated in Figure 5 of
Meyer et al. (2007), where it can be seen that HI selected
HIPASS galaxies have a smaller correlation length than optically
selected samples, but also that the difference in amplitude
of the correlation function is greatest at scales less
than 1 Mpc, where the 1-halo term dominates. Thus, by determining
the relationship between $M_{\rm min}$ and $M_1$, the real
space HOD correlation function quantifies previous suggestions
that HI galaxies are under-represented in overdense
environments (Waugh et al. 2002).

The inferred values of $M_{\rm min}$ for the HIPASS galaxies
are quantitatively consistent between the unweighted and
weighted clustering estimates, making estimates of the halo
mass for HIPASS galaxies fairly robust (we note that the
estimates are partly driven by the galaxy density, which is common
between the two cases). Moreover, the clustering estimate
of host mass from the unweighted HIPASS correlation
function is easily reconciled with the dynamical estimates
of HIPASS galaxy host masses shown in Figure 1, for which
the logarithmic means are $\langle \log_{10}(M/M_\odot) \rangle = 11.1$ for both
of the dynamical mass estimates presented.



4.1 The HI mass fraction in HIPASS galaxies 

The halo mass estimates derived from the combination of
clustering and density of HI galaxies allow the fraction of
baryonic mass in galaxies that is in the form of HI ($f_{\rm HI}$) to
be estimated. To this end we first assume that the hydrogen
to dark-matter mass ratio is the same within galaxies as in
the mean universe, so that the total hydrogen mass within
a halo of mass $M$ is $\sim (\Omega_b/\Omega_m)\,M$. We then assume that the
baryon to dark matter mass ratio is the same for all halos, yielding

$f_{\rm HI} \sim \frac{\Omega_m}{\Omega_b} \frac{M_{\rm HI,lim}}{M_{\rm min}}$. (9)





Figure 2. Constraints on HOD parameters describing estimates of the non-parametric real-space HIPASS correlation function from 
Meyer et al. (2007). Two sets of constraints are shown, based on the unweighted (upper set) and weighted (lower set) estimates of the 
HIPASS correlation function respectively. In each case, the Upper panels show contours of the likelihood in 2-d projections of the 4-d 
parameter space used for the HOD modeling, while the Lower panels show the marginalised likelihoods on individual parameters. Here 
prior probabilities on $\alpha$, $\log M_{\rm min}$, $\log M_1$ and $\log M_{1,\rm max}$ are assumed to be constant. The contours are placed at 60%, 30% and 10% of
the peak likelihood (the location of which is marked by a dot). 

Figure 3. Examples of correlation functions that are consistent with the HOD parameter constraints in Figure 2 derived from the
estimates of the unweighted real-space HIPASS correlation function. The Left-hand panels show modeled real-space correlation functions
(with the 1-halo and 2-halo terms plotted as dotted curves). The non-parametric determinations of the real-space correlation function for
HIPASS galaxies are plotted as the data points with error bars. The parameters used for each model are listed together with the resulting
value of $\chi^2$. The Central panels show the corresponding total (solid lines) and satellite (dashed lines) occupation numbers of galaxies as
a function of halo mass. The thick contours in the Right-hand panels show the corresponding redshift space correlation functions. The
black contours correspond to $\xi = 1$, with the remaining contours differing in level by factors of 2. The model correlation function has
been smoothed at $0.5\,h^{-1}$ Mpc for comparison with the data. The thin contours are the unweighted redshift-space correlation function
for HIPASS galaxies (from Meyer et al. 2007). 



Including the systematic uncertainty as estimated by the 
differing results for $M_{\rm min}$ from the unweighted and weighted
clustering measurements, we find /hi ~ lO"^ **'''^. Thus 
we find that less than 10% of baryons within HI selected
galaxies exist in the form of HI.
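
The arithmetic behind the HI mass fraction can be sketched in a few lines. The code assumes that a halo at the minimum mass $M_{\rm min}$ holds a baryon mass $(\Omega_b/\Omega_m)\,M_{\rm min}$ and hosts a galaxy at the limiting HI mass $M_{\rm HI,lim}$, so that $f_{\rm HI} = (\Omega_m/\Omega_b)\,M_{\rm HI,lim}/M_{\rm min}$. This form follows from the two assumptions stated in this section but is our reading of them, and the input masses in the example are round illustrative numbers rather than the fitted values.

```python
def hi_fraction(m_hi_lim, m_min, omega_m=0.24, omega_b=0.04):
    """Fraction of halo baryons in HI: (Omega_m / Omega_b) * M_HI,lim / M_min."""
    return (omega_m / omega_b) * m_hi_lim / m_min

if __name__ == "__main__":
    # Round illustrative masses in M_sun (assumptions, not fitted values)
    print(hi_fraction(m_hi_lim=1.0e9, m_min=1.0e11))
```

With HI masses of order $10^{9}\,M_\odot$ in halos of order $10^{11}\,M_\odot$, the fraction is a few per cent, which illustrates why the HI content is a small part of the baryon budget under these assumptions.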



5 REDSHIFT SPACE HOD MODELS OF 
HIPASS GALAXIES 

The line-of-sight structure of the redshift-space correlation 
function is dominated by gravitational infall on large transverse
scales, and by virial motions of satellites on small
transverse scales. Both of these features can be seen in
the unweighted HIPASS redshift space correlation function
(plotted as the thin contours in the right hand panels of
Figure 3), though the fingers-of-god are less pronounced
than expected based on optical galaxy redshift surveys
(Meyer et al. 2007). As mentioned above, the
small volume of the HIPASS survey suggests that the cor- 
relation function should be constructed using a weighting 
scheme so that it is not dominated by galaxy pairs near the 
peak of the selection function. However this weighting in- 
troduces systematic uncertainty into the determination of 
the correlation function. The weighted redshift space correlation
functions (plotted as the thin contours in the right
hand panels of Figure 4) show evidence for infall, but
marginal or no evidence for fingers-of-god.

Figure 4. As per Figure 3, but showing examples of correlation functions that are consistent with the HOD parameter constraints
in Figure 2 derived from the estimates of the weighted real-space HIPASS correlation function. The thin contours are the weighted
redshift-space correlation function for HIPASS galaxies (from Meyer et al. 2007).

Additional information on the satellite galaxy distribu- 
tion is contained in the redshift space correlation function. 
In redshift space, the line-of-sight structure of the 2-halo 
term is governed by gravitational infall, while the 1-halo 
term is dominated by the virial motions of satellite galax- 
ies producing the so-called fingers-of-god. In this section we 
turn to calculation of the redshift space correlation func- 
tion using the analytic HOD model of Tinker (2007). Given 
the large uncertainties in the construction of the HIPASS 
correlation function, we do not fit the redshift space cor- 
relation function directly. Rather, based on the parameter 
constraints in Figure 2, we calculate examples of the redshift 
space correlation function for qualitative comparison with 
the HIPASS clustering. These examples, and their compar- 
ison with the HIPASS redshift space correlation function, 
offer some hints regarding the satellite distribution that are 
not available from the real space correlation function alone. 
They also indicate the way in which the full 3-dimensional 
shape of the correlation function could be utilized within a 
larger, more statistically representative sample. 

Three examples are shown in each of Figures 3 and 4 for 
comparison with each of the unweighted and weighted determinations 
of the HIPASS correlation function. The chosen 
HOD models have parameters which adequately describe the 
real-space clustering, as shown in the left hand panels. In 
each case the models differ in the values chosen for various 
parameters. These values affect the occupation of dark matter 
halos as shown in the central panels of Figures 3 and 4. 
For example, smaller values of α and M1 preferentially place 
the required number of satellites in smaller halos, and so reduce 
the prominence of the fingers-of-god. A smaller value 
of M1 also lowers the typical mass at which satellites become 
common, and so increases the fraction of galaxies that 
are satellites (the fractions are listed in the central panels). 
These two parameters are varied between the upper two panels 
of Figures 3 and 4. In the weighted case, the models also 
differ in the value of Mmin, with decreasing values from top 
to bottom. Larger values of Mmin (and hence larger values 
of bias) lead to smaller values of β = f/b, and in turn to a 
redshift-space correlation function that is less compressed along 
the line of sight on large transverse scales (as can be seen in 
the correlation functions of Figures 3 and 4). However the 
modeled fingers-of-god are more prominent than is the case 
in the HIPASS data for each of these cases. 
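
The link between bias and the large-scale line-of-sight compression can be illustrated with linear theory (Kaiser 1987). The following is a minimal sketch rather than the calculation used in this paper: it assumes the common approximation f ~ Ωm^0.6 for the growth rate, and the value Ωm = 0.27 and the bias values in the loop are illustrative choices only. The function names are hypothetical.

```python
def growth_rate(omega_m, gamma=0.6):
    """Approximate linear growth rate f ~ Omega_m**gamma (assumed form)."""
    return omega_m ** gamma


def distortion_parameter(bias, omega_m=0.27):
    """Linear redshift-space distortion parameter beta = f / b."""
    return growth_rate(omega_m) / bias


def kaiser_monopole_boost(beta):
    """Ratio of the angle-averaged redshift-space to real-space
    correlation function in linear theory: 1 + 2*beta/3 + beta**2/5."""
    return 1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0


# Illustrative bias values (assumptions, not fits to HIPASS).
for bias in (0.7, 1.0, 1.5):
    beta = distortion_parameter(bias)
    boost = kaiser_monopole_boost(beta)
    print(f"b = {bias:.1f}  beta = {beta:.2f}  boost = {boost:.2f}")
```

A larger bias gives a smaller β and hence a weaker linear-theory distortion, which is the trend described above for larger values of Mmin.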

In the lower panels of Figures 3 and 4 we show examples 
that impose an upper limit on the host mass containing 
satellite galaxies. By excluding the presence of satellites 
in massive halos, the values of M1,max ~ lO^'^ '^Mo and 
M1,max ~ IO^ '^Mq in the unweighted and weighted cases 
force the required number of satellites to reside in smaller 
halos. This reduces the prominence of the fingers-of-god, 
which are sensitive to the magnitude of satellite virial motions 
within the host halo. As a result these models yield 
fingers-of-god which are of comparable strength to those 
seen in the HIPASS data. On the other hand, these same 
fits to the unweighted estimate of the real-space correlation 
function predict a line-of-sight compression at large transverse 
separations [Kaiser (1987) effect] that appears to be 
too large to explain the HIPASS data^. In the weighted case 
the correlation function amplitude is below the observed estimate 
owing to the tension between the density and correlation 
function amplitudes in this case. 
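
The sensitivity of the fingers-of-god to host halo mass can be illustrated with a simple virial estimate. The sketch below is not part of the HOD model used here. It assumes a halo defined by a mean overdensity of 200 times the mean matter density, a one-dimensional dispersion equal to the circular velocity at the virial radius divided by √2 (an isothermal-sphere approximation), and an illustrative cosmology with Ωm = 0.27 and h = 0.7. The "group" and "cluster" masses are round illustrative values, and the function name is hypothetical.

```python
import math

G_NEWTON = 4.30091e-9   # gravitational constant in Mpc (km/s)^2 / Msun
RHO_CRIT_H2 = 2.775e11  # critical density in h^2 Msun / Mpc^3


def virial_dispersion_kms(mass_msun, omega_m=0.27, h=0.7, overdensity=200.0):
    """Rough 1D velocity dispersion (km/s) of a halo of the given mass.

    Assumes the halo encloses `overdensity` times the mean matter density
    and that sigma = v_circ / sqrt(2) at the virial radius.
    """
    rho_mean = omega_m * RHO_CRIT_H2 * h ** 2
    r_vir = (3.0 * mass_msun / (4.0 * math.pi * overdensity * rho_mean)) ** (1.0 / 3.0)
    v_circ = math.sqrt(G_NEWTON * mass_msun / r_vir)
    return v_circ / math.sqrt(2.0)


# Illustrative masses for a group-scale and a cluster-scale halo.
for label, mass in (("group", 1e13), ("cluster", 1e15)):
    sigma = virial_dispersion_kms(mass)
    print(f"{label:8s} M = {mass:.0e} Msun  sigma ~ {sigma:.0f} km/s")
```

Because the dispersion scales as M^(1/3) in this estimate, moving satellites from cluster mass to group mass halos shortens the fingers-of-god by a factor of several.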
Figure 5. Likelihood functions (per unit logarithm) for the fraction of satellites among the HIPASS galaxies. 



6 SATELLITE FRACTION 

Taken together the results of our modeling suggest that HI 
rich satellite galaxies are not common in HIPASS, or else 
the 1-halo term would be more prominent in the real-space 
correlation function. This is quantified in Figure 5 where 
we show the likelihoods (per unit logarithm) for the ratio 
⟨N⟩sat/⟨N⟩M obtained by marginalising over the HOD distributions 
shown in Figures 1 and 2 for the unweighted and 
weighted HIPASS correlation functions respectively. A range 
of HOD models can describe the HIPASS real space corre- 
lation function, and our fits include a range of values with 
means near 3% and 10% for the fraction of satellites in 
the unweighted and weighted cases. Although unlikely, we 
find that the weighted estimate of the real space correlation 
function can be described with HOD models for which the 
satellite fraction is greater than 20%. However we find that 
HOD models which have more than 20% of the galaxies as 
satellites have fingers-of-god that are too prominent (e.g. 
see Figure 4). The satellite fraction of ⟨N⟩sat/⟨N⟩M ~ 0.20 
should therefore be considered an upper limit for HIPASS 
galaxies. 
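
To make the satellite fraction concrete, the sketch below evaluates ⟨N⟩sat/⟨N⟩M for a simple occupation model. It is a sketch under stated assumptions rather than the HOD parameterisation fitted in this paper: it assumes one central galaxy per halo above Mmin, (M/M1)^α satellites per halo, an optional upper mass limit for satellite hosts, and a toy halo mass function (a power law with an exponential cut-off) that is not calibrated against simulations. All function names and parameter values are hypothetical.

```python
import numpy as np


def toy_mass_function(mass, m_star=1e14, slope=-0.9):
    """Toy halo abundance per unit ln(M): power law with exponential cut-off.

    An illustrative stand-in, not a calibrated mass function.
    """
    return (mass / m_star) ** slope * np.exp(-mass / m_star)


def satellite_fraction(m_min, m_1, alpha, m_1_max=np.inf):
    """Fraction of galaxies that are satellites for the assumed HOD.

    Centrals: one per halo with M >= m_min.
    Satellites: (M / m_1)**alpha per halo, set to zero above m_1_max.
    """
    ln_m = np.linspace(np.log(m_min), np.log(1e16), 4000)
    mass = np.exp(ln_m)
    n_halo = toy_mass_function(mass)
    n_sat = np.where(mass <= m_1_max, (mass / m_1) ** alpha, 0.0)
    step = ln_m[1] - ln_m[0]
    total_sat = np.sum(n_sat * n_halo) * step   # satellites per unit volume
    total_cen = np.sum(n_halo) * step           # centrals per unit volume
    return total_sat / (total_sat + total_cen)


# Illustrative parameter choices (assumptions, not fitted values).
print(satellite_fraction(1e11, 1e14, 1.0))
print(satellite_fraction(1e11, 1e13, 1.0))
print(satellite_fraction(1e11, 1e13, 1.0, m_1_max=1e14))
```

In this toy model lowering M1 raises the satellite fraction, while an upper limit on the host mass removes satellites, so M1 must be lowered if the number of satellites is to be kept fixed.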

^ Note that we have plotted the redshift space correlation function 
using reflections of the measured correlation in the first quadrant 
to fill the remaining three quadrants. As a result, features 
due to noise in the correlation function are repeated and could 
give the impression of a systematic difference between the shape 
of the data and model correlation functions where no statistically 
significant difference exists. 

For comparison, typical fits to the halo occupation distribution 
of optical samples have satellite fractions that vary 
with galaxy luminosity and type. For example, the HOD 
modeling of Brown et al. (2008) implies a satellite galaxy 
fraction of ⟨N⟩sat/⟨N⟩M ~ 0.5 among red galaxies with 
0.2 < z < 0.4 and a comparable space density to HIPASS. 
This suggests that red galaxies (which are HI poor) are more 
common among the satellite population than HI selected 
galaxies. On the other hand, for galaxies in the Sloan Digital 
Sky Survey with r-band absolute magnitudes in excess of -19 
(again a sample with a comparable density to HIPASS galaxies) 
the HOD parametrization found in Zehavi et al. (2005) 
implies a satellite fraction of ⟨N⟩sat/⟨N⟩M ~ 0.25. This value 
lies between the fraction we find from HIPASS, and the 
fraction found for red galaxies (Brown et al. 2008). Zehavi 
et al. (2005) divide their galaxy population into blue and 
red galaxies. They find that the red galaxy population has 
a steeper correlation function, which, when interpreted in 
terms of the HOD model, implies that satellite galaxies are 
rarer among blue galaxies than among red galaxies. Thus 
there appears to be a sequence of satellite fractions. A sub- 
sample of red galaxies includes a larger proportion of satel- 
lites than does a sub-sample of blue galaxies, which in turn 
has a larger proportion of satellites than an HI selected sub- 
sample of galaxies. 

Thus, as with observations of optical galaxy clustering, 
studying how HI galaxies populate dark matter halos pro- 
vides important insights into how galaxies are assembled 
and evolve over cosmic time. For example, if massive galax- 
ies grow largely via galaxy mergers rather than in-situ star 
formation, then star forming galaxies with large HI masses 
will be largely absent from the most massive dark matter 
halos. HI selected satellite galaxies will also be rare if HI 
galaxies merge rapidly after the merger of their host dark 
matter halos. Similarly, if HI is consumed or removed from 
satellite galaxies within dark matter halos, then satellite 
galaxies would be under-represented in HI surveys relative 
to optical surveys, as seems to be the case based on our analysis 
of HIPASS. Although our HOD results are suggestive 
of these scenarios, the precision with which the HI HOD 
can be studied with HIPASS is limited. However the much 
larger volumes that will become available with the advent of 
deeper HI surveys such as those to be undertaken with the 
Australian SKA Pathfinder (ASKAP, Johnston et al. 2008) 
will allow more detailed comparison of the halo occupation 
of stars and HI. This will in turn facilitate formulation of a 
more detailed understanding of the growth of stellar mass 
in galaxies. 



7 SUMMARY 

In this paper we have analysed the clustering properties of 
HI selected galaxies from the HIPASS survey using the for- 
malism of the halo occupation distribution. Use of the HOD 
model separates the clustering amplitude into contributions 
from galaxy pairs that are in the same halo (the 1-halo term) 
and pairs that reside in different halos (the 2-halo term). The 
real-space clustering amplitude is significant on scales below 
the virial radius associated with the halo mass required to re- 
produce the clustering amplitude on large scales, indicating 
that single halo pairs are contributing a 1-halo term. How- 
ever the resulting parameter constraints show that satellite 
galaxies make up only ~ 10% of the HIPASS sample. HI 
satellite galaxies are therefore less significant in number and 
in terms of their contribution to clustering statistics than are 
satellites in optically selected galaxy redshift surveys. Thus 
HOD modeling of HI galaxy clustering quantifies the extent 
to which environment governs the HI content of galaxies and 
confirms previous evidence that HI galaxies are relatively 
rare in overdense environments (Waugh et al. 2002; Cortes 
et al. 2008). Through our real-space modeling of HIPASS 
clustering we find a minimum halo mass for HIPASS galax- 
ies at the peak of the redshift distribution of M ^ lO^'^M©, 
and show that less than 10% of baryons in HIPASS galaxies 
are in the form of HI. 

Our analysis reveals significant degeneracies in the HOD 
parameters that give acceptable fits to the real-space HI 
correlation function. However the extra line-of-sight dimen- 
sion in the redshift-space correlation function helps to break 
these degeneracies because the fingers-of-god are sensitive 
to the typical halo mass in which satellite galaxies reside. 
Our analysis of the redshift space correlation function indi- 
cates that in order to get fingers-of-god in a model which 
are as subtle as those in the HIPASS observations, the HI 
rich satellites required to produce the measured 1-halo term 
must be preferentially in group rather than cluster mass ha- 
los. In our modeling the best representations of the fingers- 
of-god are obtained by imposing an upper limit on the halo 
mass where HI satellites are found of ~ 1O^^ '''~^*'^M0. This 
finding is in accord with direct observations of rich optical 
clusters, which show no overdensity of HI galaxies relative 
to the field (Waugh et al. 2002; Cortes et al. 2008). Quanti- 
tative constraints on HOD models from the HIPASS survey 
are limited by the small survey volume, which makes the 
determination of the correlation function systematically uncertain 
(Meyer et al. 2007). Future deeper HI surveys with 
telescopes like the Australian SKA Pathfinder (ASKAP) will 
survey a much larger volume (Johnston et al. 2008) and al- 
low the distribution of HI with environment to be studied 
in more detail via precise measurements of clustering in HI 
galaxies. 

The cosmic star-formation rate has declined by more 
than an order of magnitude in the past 8 billion years (Lilly 
et al. 1996, Madau et al. 1996). The decline is observed 
across all wavelengths (Hopkins 2004 and references therein) 
and apparently defies observational limitations such as sam- 
ple selection and cosmic variance (Westra & Jones 2008). 
Optical studies paint a somewhat passive picture of galaxy 
formation, with the stellar mass density of galaxies gradually 
increasing and an increasing fraction of stellar mass 
ending up within red galaxies that have negligible star-formation 
(e.g. Brown et al. 2008). However optical studies 
can only address part of the picture. Currently, the combination 
of direct HI observations at low redshift (Zwaan et 
al. 2005b; Lah et al. 2007) and damped Lyα absorbers in 
the spectra of high-redshift QSOs (Prochaska et al. 2005) 
shows that the neutral gas density has remained remarkably 
constant over the age of the universe. At these levels, and 
without replenishment, HI gas would be exhausted in a few 
billion years (Hopkins et al. 2008). Models incorporating gas 
infall that balances star formation and gas outflow are therefore 
necessary to reproduce observed star formation densities 
(e.g. Erb 2008). The evolutionary and environmental relationships 
between the neutral gas which provides the fuel 
for star formation and the stars that form are central to un- 
derstanding these and related issues. The study of the halo 
occupation distribution of HI based on HIPASS galaxies pre- 
sented in this paper provides the first quantitative hints of 
this relationship. 
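
The statement that HI would be exhausted in a few billion years without replenishment follows from a simple ratio of the neutral gas density to the star-formation rate density. The sketch below illustrates the arithmetic only. The input values (an HI density parameter of 3.5e-4, a local star-formation rate density of 0.015 Msun/yr/Mpc^3, and h = 0.7) are round illustrative assumptions and are not taken from this paper. The estimate ignores helium, molecular gas and gas returned by stars. The function name is hypothetical.

```python
RHO_CRIT_H2 = 2.775e11  # critical density in h^2 Msun / Mpc^3


def hi_depletion_time_gyr(omega_hi, sfr_density, h=0.7):
    """Time (Gyr) to consume the cosmic HI at a constant star-formation rate.

    omega_hi    : HI density in units of the critical density (assumed input)
    sfr_density : star-formation rate density in Msun / yr / Mpc^3 (assumed input)
    """
    rho_hi = omega_hi * RHO_CRIT_H2 * h ** 2  # Msun / Mpc^3
    return rho_hi / sfr_density / 1e9


# Round illustrative inputs (assumptions, not measurements from this work).
t_dep = hi_depletion_time_gyr(omega_hi=3.5e-4, sfr_density=0.015)
print(f"HI depletion time ~ {t_dep:.1f} Gyr")
```

For inputs of this order the depletion time is a few Gyr, much shorter than the age of the universe, which is why infall is needed to sustain the observed star formation.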